13 research outputs found

    The Role of Randomness and Noise in Strategic Classification

    We investigate the problem of designing optimal classifiers in the strategic classification setting, where classification is part of a game in which players can modify their features, at some cost, to attain a favorable classification outcome. Previously, the problem has been considered from a learning-theoretic perspective and from an algorithmic fairness perspective. Our main contributions include:
    1. Showing that if the objective is to maximize the efficiency of the classification process (defined as the accuracy of the outcome minus the sunk cost incurred by qualified players manipulating their features to gain a better outcome), then randomized classifiers (ones for which the probability that a given feature vector is accepted is strictly between 0 and 1) are necessary.
    2. Showing that in many natural cases, the efficiency-optimal solution has a structure in which players never change their feature vectors: the randomized classifier is arranged so that the gain in the probability of being classified as a 1 never justifies the expense of changing one's features (formalized in the sketch after this abstract).
    3. Observing that randomized classification is not a stable best response from the classifier's viewpoint: the classifier cannot benefit from randomization without creating instability in the system.
    4. Showing that in some cases a noisier signal leads to better equilibrium outcomes, improving both accuracy and fairness when multiple subpopulations with different feature-adjustment costs are involved. This is interesting from a policy perspective: it is hard to force institutions to stick to a particular randomized classification strategy (especially in a market with multiple classifiers), but it is possible to alter the information environment to make the feature signals inherently noisier.
    Comment: 22 pages. Appeared in FORC, 202
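    For intuition, here is a minimal formalization of the best-response tradeoff behind item 2, under the illustrative assumption that a player's utility is acceptance probability minus manipulation cost (the paper's exact payoff model may differ):

        $u(y \to y') = p(y') - c(y, y'), \qquad \text{no player moves} \iff p(y') - p(y) \le c(y, y') \quad \forall\, y, y'.$

    In words: the efficiency-optimal randomized classifier increases the acceptance probability $p$ slowly enough in the features that the gain in probability never exceeds the cost of the move.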

    Extractor-Based Time-Space Lower Bounds for Learning

    A matrix $M: A \times X \rightarrow \{-1,1\}$ corresponds to the following learning problem: An unknown element $x \in X$ is chosen uniformly at random. A learner tries to learn $x$ from a stream of samples, $(a_1, b_1), (a_2, b_2), \ldots$, where for every $i$, $a_i \in A$ is chosen uniformly at random and $b_i = M(a_i, x)$. Assume that $k, \ell, r$ are such that any submatrix of $M$ with at least $2^{-k} \cdot |A|$ rows and at least $2^{-\ell} \cdot |X|$ columns has a bias of at most $2^{-r}$. We show that any learning algorithm for the learning problem corresponding to $M$ requires either a memory of size at least $\Omega(k \cdot \ell)$, or at least $2^{\Omega(r)}$ samples. The result holds even if the learner has an exponentially small success probability (of $2^{-\Omega(r)}$). In particular, this shows that for a large class of learning problems, any learning algorithm requires either a memory of size at least $\Omega((\log |X|) \cdot (\log |A|))$ or an exponential number of samples, achieving a tight $\Omega((\log |X|) \cdot (\log |A|))$ lower bound on the size of the memory, rather than the bound of $\Omega(\min\{(\log |X|)^2, (\log |A|)^2\})$ obtained in previous works [R17, MM17b]. Moreover, our result implies all previous memory-samples lower bounds, as well as a number of new applications. Our proof builds on [R17], which gave a general technique for proving memory-samples lower bounds.
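    As a concrete instance of this setup, parity learning takes $A = X = \{0,1\}^n$ and $M(a, x) = (-1)^{\langle a, x \rangle \bmod 2}$. A minimal sketch of the sample stream in this instance (the values of n and the stream length are illustrative, not from the paper):

        import random

        n = 16
        x = [random.randrange(2) for _ in range(n)]  # unknown element, uniform over X

        def sample():
            # a_i is uniform over A; b_i = M(a_i, x) = (-1)^{<a_i, x> mod 2}
            a = [random.randrange(2) for _ in range(n)]
            b = (-1) ** (sum(ai * xi for ai, xi in zip(a, x)) % 2)
            return a, b

        stream = (sample() for _ in range(1000))  # the learner sees (a_1, b_1), (a_2, b_2), ...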

    New security notions and feasibility results for authentication of quantum data

    We give a new class of security definitions for authentication in the quantum setting. These definitions capture and strengthen existing definitions of security against quantum adversaries, both for classical message authentication codes (MACs) and for full quantum state authentication schemes. The main feature of our definitions is that they precisely characterize the effective behavior of any adversary when the authentication protocol accepts, including correlations with the key. Our definitions readily yield a host of desirable properties and interesting consequences; for example, our security definition for full quantum state authentication implies that the entire secret key can be re-used if the authentication protocol succeeds. Next, we present several protocols satisfying our security definitions. We show that the classical Wegman-Carter authentication scheme with 3-universal hashing is secure against superposition attacks, as well as against adversaries with quantum side information. We then present conceptually simple constructions of full quantum state authentication. Finally, we prove a lifting theorem which shows that, as long as a protocol can securely authenticate the maximally entangled state, it can securely authenticate any state, even those that are entangled with the adversary. This shows that protocols satisfying a fairly weak form of authentication security automatically satisfy a stronger notion of security (in particular, the definition of Dupuis et al. (2012)).
    Comment: 50 pages, QCrypt 2016 - 6th International Conference on Quantum Cryptography; added a new lifting theorem that shows equivalence between a weak form of authentication security and a stronger notion that considers side information.
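    For reference, a 3-universal family of the kind the Wegman-Carter result refers to can be built from random polynomials of degree at most 2 over a prime field. A minimal classical sketch follows (the modulus and integer message encoding are illustrative; the quantum-security claims above concern the scheme's behavior under superposition attacks, which this classical code does not model):

        import secrets

        P = (1 << 61) - 1  # a Mersenne prime used as the field modulus

        def keygen():
            # A uniformly random polynomial a*m^2 + b*m + c over Z_P gives a
            # 3-wise independent (hence 3-universal) hash family.
            return tuple(secrets.randbelow(P) for _ in range(3))

        def tag(key, m):
            a, b, c = key
            return (a * m * m + b * m + c) % P

        key = keygen()
        t = tag(key, 42)          # authenticate the message 42 under a one-time key
        assert t == tag(key, 42)  # the verifier recomputes the tag and compares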

    The Space Complexity of Mirror Games

    We consider the following game between two players Alice and Bob, which we call the mirror game. Alice and Bob take turns saying numbers belonging to the set {1, 2, ..., N}. A player loses if they repeat a number that has already been said. Otherwise, after N turns, when all the numbers have been spoken, both players win. When N is even, Bob, who goes second, has a very simple (and memoryless) strategy to avoid losing: whenever Alice says x, respond with N+1-x. The question is: does Alice have a similarly simple strategy to win that avoids remembering all the numbers said by Bob? The answer is no. We prove a linear lower bound on the space complexity of any deterministic winning strategy for Alice. Interestingly, this follows as a consequence of the Eventown-Oddtown theorem from extremal combinatorics. We additionally demonstrate a randomized strategy for Alice that wins with high probability and requires only O~(sqrt N) space (provided that Alice has access to a random matching on K_N). We also investigate lower bounds for a generalized mirror game where Alice and Bob alternate saying 1 number and b numbers each turn (respectively). When 1+b is a prime, our linear lower bounds continue to hold, but when 1+b is composite, we show that the existence of an o(N) space strategy for Bob (when N != 0 mod (1+b)) implies the existence of exponential-sized matching vector families over Z^N_{1+b}.
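    Bob's memoryless strategy from the abstract is short enough to state in code; a minimal sketch (for even N, as above):

        # The pairs {x, N + 1 - x} partition {1, ..., N}, so Bob's reply is always
        # a number that has not been said, provided Alice herself never repeats.
        def bob_reply(N, alice_move):
            return N + 1 - alice_move

        assert bob_reply(10, 3) == 8  # 3 and 8 are matched partners in {1, ..., 10}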

    Coding in Undirected Graphs Is Either Very Helpful or Not Helpful at All

    While it is known that using network coding can significantly improve the throughput of directed networks, it is a notorious open problem whether coding yields any advantage over the multicommodity flow (MCF) rate in undirected networks. It was conjectured that the answer is no. In this paper we show that even a small advantage over MCF can be amplified to yield a near-maximum possible gap. We prove that any undirected network with k source-sink pairs that exhibits a (1+epsilon) gap between its MCF rate and its network coding rate can be used to construct a family of graphs G' whose gap is log(|G'|)^c for some constant c < 1. The resulting gap is close to the best currently known upper bound, log(|G'|), which follows from the connection between MCF and sparsest cuts. Our construction relies on a gap-amplifying graph tensor product that, given two graphs G1, G2 with small gaps, creates another graph G with a gap that is equal to the product of the previous two, at the cost of increasing the size of the graph. We iterate this process to obtain a gap of log(|G'|)^c from any initial gap.
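    The amplification step can be summarized by one identity. If the tensor product multiplies gaps, as stated above, then t-fold self-tensoring of a graph with gap $1+\epsilon$ yields

        $\mathrm{gap}(G^{\otimes t}) = (1+\epsilon)^t,$

    which is super-constant once $t = \omega(1)$; how large $t$ can usefully be taken is limited by how fast $|G^{\otimes t}|$ grows, and balancing these two quantities is what produces the final $\log(|G'|)^c$ gap with $c < 1$.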

    Time-Space Lower Bounds for Two-Pass Learning

    A line of recent works showed that for a large class of learning problems, any learning algorithm requires either super-linear memory size or a super-polynomial number of samples [Raz, 2016; Kol et al., 2017; Raz, 2017; Moshkovitz and Moshkovitz, 2018; Beame et al., 2018; Garg et al., 2018]. For example, any algorithm for learning parities of size n requires either a memory of size Omega(n^{2}) or an exponential number of samples [Raz, 2016]. All these works modeled the learner as a one-pass branching program, allowing only one pass over the stream of samples. In this work, we prove the first memory-samples lower bounds (with a super-linear lower bound on the memory size and a super-polynomial lower bound on the number of samples) when the learner is allowed two passes over the stream of samples. For example, we prove that any two-pass algorithm for learning parities of size n requires either a memory of size Omega(n^{1.5}) or at least 2^{Omega(sqrt{n})} samples. More generally, a matrix M: A x X -> {-1,1} corresponds to the following learning problem: An unknown element x in X is chosen uniformly at random. A learner tries to learn x from a stream of samples, (a_1, b_1), (a_2, b_2), ..., where for every i, a_i in A is chosen uniformly at random and b_i = M(a_i,x). Assume that k, l, r are such that any submatrix of M with at least 2^{-k} * |A| rows and at least 2^{-l} * |X| columns has a bias of at most 2^{-r}. We show that any two-pass learning algorithm for the learning problem corresponding to M requires either a memory of size at least Omega(k * min{k, sqrt{l}}), or at least 2^{Omega(min{k, sqrt{l}, r})} samples.
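    The (k, l, r) condition above is a statement about the biases of all large submatrices of M. A minimal sketch of what "bias" means here, for the toy parity matrix on a small n (the sizes and the random choice of a single submatrix are illustrative; the paper's condition quantifies over every sufficiently large submatrix):

        import itertools, random

        n = 8
        def M(a, x):
            return (-1) ** (sum(ai * xi for ai, xi in zip(a, x)) % 2)

        def bias(rows, cols):
            # |average entry| of the submatrix of M indexed by rows x cols
            total = sum(M(a, x) for a in rows for x in cols)
            return abs(total) / (len(rows) * len(cols))

        A = list(itertools.product([0, 1], repeat=n))
        rows = random.sample(A, len(A) // 4)  # a 2^{-k} fraction of rows (k = 2)
        cols = random.sample(A, len(A) // 4)  # a 2^{-l} fraction of columns (l = 2)
        print(bias(rows, cols))               # small bias is what drives the lower bound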

    Oracle Efficient Online Multicalibration and Omniprediction

    A recent line of work has shown a surprising connection between multicalibration, a multi-group fairness notion, and omniprediction, a learning paradigm that provides simultaneous loss minimization guarantees for a large family of loss functions. Prior work studies omniprediction in the batch setting. We initiate the study of omniprediction in the online adversarial setting. Although there exist algorithms for obtaining notions of multicalibration in the online adversarial setting, unlike batch algorithms, they work only for small finite classes of benchmark functions $F$, because they require enumerating every function $f \in F$ at every round. In contrast, omniprediction is most interesting for learning-theoretic hypothesis classes $F$, which are generally continuously large. We develop a new online multicalibration algorithm that is well defined for infinite benchmark classes $F$ and is oracle efficient (i.e., for any class $F$, the algorithm has the form of an efficient reduction to a no-regret learning algorithm for $F$). The result is the first efficient online omnipredictor: an oracle-efficient prediction algorithm that can be used to simultaneously obtain no-regret guarantees with respect to all Lipschitz convex loss functions. For the class $F$ of linear functions, we show how to make our algorithm efficient in the worst case. We also show upper and lower bounds on the extent to which our rates can be improved: our oracle-efficient algorithm actually promises a stronger guarantee called swap-omniprediction, and we prove a lower bound showing that obtaining $O(\sqrt{T})$ bounds for swap-omniprediction is impossible in the online setting. On the other hand, we give a (non-oracle-efficient) algorithm which can obtain the optimal $O(\sqrt{T})$ omniprediction bounds without going through multicalibration, giving an information-theoretic separation between these two solution concepts.
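    For orientation, the batch-style omniprediction guarantee referred to above can be stated in one line; this is the standard formulation from the prior work the abstract cites, not the paper's online definition: a predictor $p$ is an omnipredictor for a loss family $\mathcal{L}$ and benchmark class $F$ if for every $\ell \in \mathcal{L}$ there is a post-processing $k_\ell$ such that

        $\mathbb{E}\left[\ell\big(k_\ell(p(x)), y\big)\right] \le \min_{f \in F} \mathbb{E}\left[\ell\big(f(x), y\big)\right] + \epsilon.$

    The online setting replaces these expectations with regret against an adversarially chosen sequence.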

    Memory-Sample Lower Bounds for Learning Parity with Noise

    In this work, we show, for the well-studied problem of learning parity under noise, where a learner tries to learn $x = (x_1, \ldots, x_n) \in \{0,1\}^n$ from a stream of random linear equations over $\mathrm{F}_2$ that are correct with probability $\frac{1}{2}+\varepsilon$ and flipped with probability $\frac{1}{2}-\varepsilon$, that any learning algorithm requires either a memory of size $\Omega(n^2/\varepsilon)$ or an exponential number of samples. In fact, we study memory-sample lower bounds for a large class of learning problems, as characterized by [GRT'18], when the samples are noisy. A matrix $M: A \times X \rightarrow \{-1,1\}$ corresponds to the following learning problem with error parameter $\varepsilon$: an unknown element $x \in X$ is chosen uniformly at random. A learner tries to learn $x$ from a stream of samples, $(a_1, b_1), (a_2, b_2), \ldots$, where for every $i$, $a_i \in A$ is chosen uniformly at random and $b_i = M(a_i, x)$ with probability $1/2+\varepsilon$ and $b_i = -M(a_i, x)$ with probability $1/2-\varepsilon$ ($0 < \varepsilon < \frac{1}{2}$). Assume that $k, \ell, r$ are such that any submatrix of $M$ with at least $2^{-k} \cdot |A|$ rows and at least $2^{-\ell} \cdot |X|$ columns has a bias of at most $2^{-r}$. We show that any learning algorithm for the learning problem corresponding to $M$, with error, requires either a memory of size at least $\Omega\left(\frac{k \cdot \ell}{\varepsilon}\right)$, or at least $2^{\Omega(r)}$ samples. In particular, this shows that for a large class of learning problems, the same as those in [GRT'18], any learning algorithm requires either a memory of size at least $\Omega\left(\frac{(\log |X|) \cdot (\log |A|)}{\varepsilon}\right)$ or an exponential number of noisy samples. Our proof is based on adapting the arguments in [Raz'17, GRT'18] to the noisy case.
    Comment: 19 pages. To appear in RANDOM 2021. arXiv admin note: substantial text overlap with arXiv:1708.0263
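    A minimal sketch of the noisy stream described above, for learning parity with noise (the values of n and eps are illustrative, not from the paper):

        import random

        n, eps = 16, 0.1
        x = [random.randrange(2) for _ in range(n)]  # the unknown parity x in {0,1}^n

        def noisy_sample():
            a = [random.randrange(2) for _ in range(n)]
            b = sum(ai * xi for ai, xi in zip(a, x)) % 2  # the correct bit <a, x> mod 2
            if random.random() >= 0.5 + eps:              # flip with probability 1/2 - eps
                b ^= 1
            return a, b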

    Time-Space Tradeoffs for Distinguishing Distributions and Applications to Security of Goldreich's PRG

    In this work, we establish lower bounds against memory-bounded algorithms for distinguishing between natural pairs of related distributions from samples that arrive in a streaming setting. In our first result, we show that any algorithm that distinguishes between the uniform distribution on $\{0,1\}^n$ and the uniform distribution on an $n/2$-dimensional linear subspace of $\{0,1\}^n$ with non-negligible advantage needs $2^{\Omega(n)}$ samples or $\Omega(n^2)$ memory. Our second result applies to distinguishing outputs of Goldreich's local pseudorandom generator from the uniform distribution on the output domain. Specifically, Goldreich's pseudorandom generator $G$ fixes a predicate $P: \{0,1\}^k \rightarrow \{0,1\}$ and a collection of subsets $S_1, S_2, \ldots, S_m \subseteq [n]$ of size $k$. For any seed $x \in \{0,1\}^n$, it outputs $P(x_{S_1}), P(x_{S_2}), \ldots, P(x_{S_m})$, where $x_{S_i}$ is the projection of $x$ to the coordinates in $S_i$. We prove that whenever $P$ is $t$-resilient (all non-zero Fourier coefficients of $(-1)^P$ are of degree $t$ or higher), then no algorithm with less than $n^\epsilon$ memory can distinguish the output of $G$ from the uniform distribution on $\{0,1\}^m$ with a large inverse polynomial advantage, for stretch $m \le \left(\frac{n}{t}\right)^{\frac{(1-\epsilon)}{36} \cdot t}$ (barring some restrictions on $k$). The lower bound holds in the streaming model, where at each time step $i$, $S_i \subseteq [n]$ is a randomly chosen (ordered) subset of size $k$ and the distinguisher sees either $P(x_{S_i})$ or a uniformly random bit, along with $S_i$. Our proof builds on the recently developed machinery for proving time-space trade-offs (Raz 2016 and follow-ups) for search and learning problems.
    Comment: 35 pages
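    A minimal sketch of the streaming distinguisher's view described above, with XOR as the predicate; XOR on k bits is (k-1)-resilient, since the only non-zero Fourier coefficient of $(-1)^P$ sits at degree k (the values of n and k are illustrative, not from the paper):

        import random

        n, k = 32, 5
        x = [random.randrange(2) for _ in range(n)]  # the secret seed

        def P(bits):
            return sum(bits) % 2  # XOR predicate on k bits

        def prg_stream_bit():
            S = random.sample(range(n), k)  # a random (ordered) size-k subset S_i
            return S, P([x[j] for j in S])  # the distinguisher sees (S_i, P(x_{S_i}))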